Scaling Machine Learning Systems using Domain Adaptation
Machine-learned components, particularly those trained using deep learning methods, are becoming integral parts of modern intelligent systems, with applications including computer vision, speech processing, natural language processing, and human activity recognition. As these machine learning (ML) systems scale to real-world settings, they will encounter scenarios where the distribution of the data in the real world (i.e., the target domain) differs from the data on which they were trained (i.e., the source domain). This phenomenon, known as domain shift, can significantly degrade the performance of ML systems in new deployment scenarios. In this thesis, we study the impact of domain shift caused by variations in system hardware, software, and user preferences on the performance of ML systems. After quantifying the performance degradation of ML models in target domains due to the various types of domain shift, we propose unsupervised domain adaptation (uDA) algorithms that leverage unlabeled data collected in the target domain to improve the performance of the ML model. At its core, this thesis argues for the need to develop uDA solutions that adhere to the practical scenarios in which ML systems will scale. More specifically, we consider four scenarios: (i) opaque ML systems, wherein parameters of the source prediction model are not made accessible in the target domain; (ii) transparent ML systems, wherein source model parameters are accessible and can be modified in the target domain; (iii) ML systems where source and target domains do not have identical label spaces; and (iv) distributed ML systems, wherein the source and target domains are geographically distributed, and their datasets are private and cannot be exchanged during adaptation. We study the unique challenges and constraints of each scenario and propose novel uDA algorithms that outperform state-of-the-art baselines.
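The specific uDA algorithms differ across the four scenarios, but a common ingredient of unsupervised adaptation is quantifying how far apart the source and target feature distributions are using only unlabeled target data. As a generic illustration (not an algorithm from this thesis), here is a minimal NumPy sketch of the squared maximum mean discrepancy (MMD) with an RBF kernel, a standard distribution-distance used in the uDA literature:

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Pairwise RBF kernel matrix between rows of a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def mmd2(source, target, gamma=0.5):
    """Squared maximum mean discrepancy between two feature sets."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(200, 2))   # "source domain" features
tgt = rng.normal(1.5, 1.0, size=(200, 2))   # shifted "target domain" features
same = rng.normal(0.0, 1.0, size=(200, 2))  # drawn from the source distribution

# A genuine domain shift yields a much larger discrepancy.
assert mmd2(src, tgt) > mmd2(src, same)
```

An adaptation method could minimize such a discrepancy between source and target feature embeddings; the thesis's own algorithms are scenario-specific and not shown here.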
Data4Good: Designing for Diversity and Development
We are witnessing unprecedented datafication of the society we live in, alongside rapid advances in the fields of Artificial Intelligence and Machine Learning. However, emergent data-driven applications are systematically discriminating against many diverse populations. A major driver of this bias is the data itself, which typically aligns with predominantly Western definitions and lacks representation from multilingually diverse and resource-constrained regions across the world. Therefore, data-driven approaches can benefit from the integration of a more human-centred orientation before being used to inform the design, deployment, and evaluation of technologies in various contexts. This workshop seeks to advance these and similar conversations by inviting researchers and practitioners in interdisciplinary domains to engage in conversation around how appropriate human-centred design can contribute to addressing data-related challenges among marginalised and under-represented/underserved groups.
Libri-Adapt: A New Speech Dataset for Unsupervised Domain Adaptation
This paper introduces a new dataset, Libri-Adapt, to support unsupervised domain adaptation research on speech recognition models. Built on top of the LibriSpeech corpus, Libri-Adapt contains English speech recorded on mobile and embedded-scale microphones, and spans 72 different domains that are representative of the challenging practical scenarios encountered by ASR models. More specifically, Libri-Adapt facilitates the study of domain shifts in ASR models caused by a) different acoustic environments, b) variations in speaker accents, c) heterogeneity in the hardware and platform software of the microphones, and d) a combination of the aforementioned three shifts. We also provide a number of baseline results quantifying the impact of these domain shifts on the Mozilla DeepSpeech2 ASR model.
Comment: 5 pages, published at IEEE ICASSP 2020.
Chronic-Pain Protective Behavior Detection with Deep Learning
In chronic pain rehabilitation, physiotherapists adapt physical activity to patients' performance based on their expression of protective behavior, gradually exposing them to feared but harmless and essential everyday activities. As rehabilitation moves outside the clinic, technology should automatically detect such behavior to provide similar support. Previous works have shown the feasibility of automatic protective behavior detection (PBD) within a specific activity. In this paper, we investigate the use of deep learning for PBD across activity types, using wearable motion capture and surface electromyography data collected from healthy participants and people with chronic pain. We approach the problem by continuously detecting protective behavior within an activity rather than estimating its overall presence. The best performance reaches a mean F1 score of 0.82 with leave-one-subject-out cross-validation. When protective behavior is modelled per activity type, the mean F1 scores are 0.77 for bend-down, 0.81 for one-leg-stand, 0.72 for sit-to-stand, 0.83 for stand-to-sit, and 0.67 for reach-forward. This performance reaches an excellent level of agreement with the average experts' rating performance, suggesting potential for personalized chronic pain management at home. We analyze various parameters characterizing our approach to understand how the results could generalize to other PBD datasets and different levels of ground-truth granularity.
Comment: 24 pages, 12 figures, 7 tables. Accepted by ACM Transactions on Computing for Healthcare.
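The leave-one-subject-out protocol used for these F1 scores trains on all participants except one and tests on the held-out participant, repeating once per subject, so reported performance reflects generalization to unseen people. The data below are placeholders; only the splitting logic is illustrated:

```python
import numpy as np

def loso_splits(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test = np.where(subject_ids == subject)[0]
        train = np.where(subject_ids != subject)[0]
        yield train, test

# Toy setting: 3 subjects, 4 data windows each.
subjects = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
folds = list(loso_splits(subjects))

for train, test in folds:
    # No subject ever appears in both the train and test split of a fold.
    held_out = set(np.asarray(subjects)[test])
    assert held_out.isdisjoint(set(np.asarray(subjects)[train]))
```

In the paper's setup, a model would be trained on each fold's training indices and the per-fold F1 scores averaged to give the reported mean.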
Management of uveal tract melanoma: A comprehensive review
Abstract: Uveal tract melanoma is the most common primary intraocular malignancy in adults, accounting for about 5–10% of all melanomas. Since there are no lymphatic vessels in the eye, uveal melanoma can only spread hematogenously, leading to liver metastasis. A wide variety of treatment modalities are available for its management, leading to a dilemma in selecting the appropriate therapy. This article reviews the available diagnostic and therapeutic modalities and can thus help to individualize the treatment plan for each patient.
Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems
Mobile and embedded devices are increasingly using microphones and audio-based computational models to infer user context. A major challenge in building systems that combine audio models with commodity microphones is to guarantee their accuracy and robustness in the real world. Besides many environmental dynamics, a primary factor that impacts the robustness of audio models is microphone variability. In this work, we propose Mic2Mic -- a machine-learned system component -- which resides in the inference pipeline of audio models and reduces, in real time, the variability in audio data caused by microphone-specific factors. Two key considerations in the design of Mic2Mic were: a) to decouple the problem of microphone variability from the audio task, and b) to put a minimal burden on end-users to provide training data. With these in mind, we apply the principles of cycle-consistent generative adversarial networks (CycleGANs) to learn Mic2Mic using unlabeled and unpaired data collected from different microphones. Our experiments show that Mic2Mic can recover between 66% and 89% of the accuracy lost due to microphone variability for two common audio tasks.
Comment: Published at ACM IPSN 2019.
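The cycle-consistency idea at the heart of this approach: a translator G maps microphone-A audio features toward microphone B, a second translator F maps back, and training penalizes F(G(x)) for drifting from x, which is what removes the need for paired recordings. A toy NumPy sketch of that loss term, with linear "translators" standing in for the paper's networks (the full CycleGAN objective also includes adversarial terms not shown here):

```python
import numpy as np

def cycle_consistency_loss(x_a, x_b, G, F):
    """L1 cycle loss: A->B->A and B->A->B reconstructions vs. originals."""
    loss_a = np.abs(F(G(x_a)) - x_a).mean()  # A -> B -> A round trip
    loss_b = np.abs(G(F(x_b)) - x_b).mean()  # B -> A -> B round trip
    return loss_a + loss_b

# Toy setting: "microphone B" features are a scaled version of A's.
scale = 2.0
G = lambda x: x * scale        # A -> B translator (exact)
F_good = lambda x: x / scale   # exact inverse of G: zero cycle loss
F_bad = lambda x: x * 0.9      # not an inverse: positive cycle loss

rng = np.random.default_rng(1)
x_a = rng.normal(size=(64, 16))
x_b = G(rng.normal(size=(64, 16)))

assert cycle_consistency_loss(x_a, x_b, G, F_good) < 1e-9
assert cycle_consistency_loss(x_a, x_b, G, F_bad) > 0.1
```

Minimizing this loss over unpaired batches from the two microphones pushes G and F toward mutually inverse mappings, which is how translation can be learned without aligned source/target recordings.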
A Taxonomy of Noise in Voice Self-reports while Running
Smart earables offer great opportunities for conducting ubiquitous computing research. This paper shares reflections on collecting self-reports from runners using the microphone on the smart eSense earbud device. Despite the advantages of eSense in allowing researchers to collect continuous voice self-reports anytime, anywhere, it also captured noise signals from various sources, creating challenges in data processing and analysis. The paper presents an initial taxonomy of noise in runners' voice self-report data collected via eSense, based on a qualitative analysis of voice recordings from eSense's microphone with 11 runners across 14 in-the-wild running sessions. The paper discusses the details and characteristics of the observed noise, the challenges in achieving good-quality self-reports, and opportunities for extracting useful contextual information. The paper further suggests a noise-categorization API for eSense or other similar platforms, not only for noise cancellation but also for mining contextual information.
Leveraging Activity Recognition to Enable Protective Behavior Detection in Continuous Data
Protective behavior exhibited by people with chronic pain (CP) during physical activities is key to understanding their physical and emotional states. Existing automatic protective behavior detection (PBD) methods rely on pre-segmentation of activities predefined by users. However, in real life, people perform activities casually. Therefore, where those activities present difficulties for people with chronic pain, technology-enabled support should be delivered continuously and automatically adapted to activity type and the occurrence of protective behavior. Hence, to facilitate ubiquitous CP management, it becomes critical to enable accurate PBD over continuous data. In this paper, we propose to integrate human activity recognition (HAR) with PBD via a novel hierarchical HAR-PBD architecture comprising graph-convolution and long short-term memory (GC-LSTM) networks, and to alleviate class imbalance using a class-balanced focal categorical-cross-entropy (CFCC) loss. Through an in-depth evaluation of the approach using a dataset of CP patients, we show that leveraging HAR, GC-LSTM networks, and the CFCC loss leads to a clear increase in PBD performance over the baseline (macro F1 score of 0.81 vs. 0.66 and precision-recall area under the curve (PR-AUC) of 0.60 vs. 0.44). We conclude by discussing possible use cases of the hierarchical architecture in CP management and beyond. We also discuss current limitations and ways forward.
Comment: Submitted to PACM IMWUT.